ABSTRACT

Most current information processing theories of
cognition and memory share one common feature: the structure
(state-space) of memory is fixed and retrieval from memory involves
searching through that structure. Learning, where it is treated at
all, involves transforming one such structure into another. This form
of representation is questioned and the structural learning theory is
proposed to take its place. In comparison, the latter theory has a
flexible structure and is shown to have greater power and parsimony,
particularly regarding individual differences and learning.
Supporting data and relationships with research in artificial
intelligence and computer simulation of problem solving are also
discussed. (Author)
THE STRUCTURE OF MEMORY: FIXED OR FLEXIBLE?
Joseph Scandura
University of Pennsylvania

The view that memory is structured goes back to the old gestaltist
notion of grouping. It also finds realization in the notion of associative
network. In more recent times, memory theorists have borrowed freely from
computer science, particularly from the areas of computer simulation and, to
a lesser extent, from the more behaviorally neutral area of artificial
intelligence.

In spite of the great variety which exists among current information 
processing theories, all such theories share one common feature: the struc- 
ture of memory is fixed and retrieval from memory involves searching through 
that structure. Learning, where it is treated at all, involves the transfor- 
mation of an existing structure into a new one. 

In the present article, this form of representation is questioned.
The first section introduces the notion of a state space (equivalently, problem
space, or relational net) and shows how a variety of prominent memory theories
are variants on the common theme. In section two, the structural learning
theory is reviewed, together with some closely related empirical research.
Finally, relationships between the structural learning theory and relational
net theories are discussed and an attempt is made to answer the question in the
title: is the structure of memory fixed or flexible?
Relational Net Theories of Memory and Cognition

State Spaces 

The notion of state space is very general and has been widely used 
as a basis for representing a variety of theories involving both computer and 
human information processing. State spaces consist of two kinds of elements, 
states and operators. In psychological terms, states refer to (encoded)
entities of various sorts (e.g., nonsense syllables, words, concepts, even
relations). Operators refer to actions which map given states into other states.

State spaces may be represented as shown in Figure 1 by directed graphs 
in which the nodes refer to states and the arrows to operators. 
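
As a minimal sketch of this representation (the class name `StateSpace`, the state labels, and the operator labels are illustrative assumptions, not part of any theory cited here), a state space can be held as a labeled directed graph in which each state maps its outgoing operators to the states they produce:

```python
from dataclasses import dataclass, field


@dataclass
class StateSpace:
    """Directed graph: nodes are states, labeled arcs are operators."""
    # state -> {operator label: resulting state}; all labels are hypothetical
    arcs: dict = field(default_factory=dict)

    def add_operator(self, state, operator, next_state):
        """Record that `operator` maps `state` into `next_state`."""
        self.arcs.setdefault(state, {})[operator] = next_state
        self.arcs.setdefault(next_state, {})  # make sure the target is a node

    def successors(self, state):
        """Return (operator, next_state) pairs leaving `state`."""
        return list(self.arcs.get(state, {}).items())


space = StateSpace()
space.add_operator("s0", "op_a", "s1")
space.add_operator("s0", "op_b", "s2")
space.add_operator("s1", "op_c", "G")
```

Here `space.successors("s0")` lists the two arcs leaving the starting state, and a state with no outgoing operators (such as `"G"`) simply has no successors.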



INSERT FIGURE 1 ABOUT HERE 



Examples of state spaces range from associative networks among common nouns 
(Bower, 1972) to directed graphs representing the possible stages through which 
a problem solver might go (Newell & Simon, 1972). The typical state space in
problem solving, for example, allows for available operators to act on nodes 
in all possible ways; psychologically a state space may be thought of as the 
totality of possible paths among the various states. 

In particular applications one or more states must be singled out as 
starting states, and a goal (G) is defined. Goals may be defined in terms of 
specific states or in terms of properties which specify a class of states 
(e.g., "is a check-mate position" in chess). 

To achieve a given goal in this view, the subject must find a solution
path from a starting state to a goal state. (To actually satisfy a goal, of
course, the operators in the path must be applied successively to some starting
state(s).) One general approach to finding a solution path is to systematically
and exhaustively try out all possible routes, either beginning at a starting state
or at a goal state. In breadth first methods (for details, see Nilsson, 1971),
all operators emanating from a given node are tried first before the outputted
states are expanded. In depth first methods, states furthest removed from the
starting state are expanded (until some predetermined depth is reached) before
new states are expanded.
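
The two exhaustive strategies can be sketched as follows. This is an illustrative sketch only: the graph `GRAPH`, the function names, and the depth bound are assumptions introduced here, and the procedures in Nilsson (1971) are more elaborate.

```python
from collections import deque


def breadth_first(graph, start, is_goal):
    """Expand all states at one distance from the start before going deeper."""
    frontier = deque([[start]])
    seen = {start}
    while frontier:
        path = frontier.popleft()
        if is_goal(path[-1]):
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None


def depth_first(graph, start, is_goal, max_depth):
    """Expand the most recently generated state first, up to max_depth steps."""
    def visit(path):
        state = path[-1]
        if is_goal(state):
            return path
        if len(path) - 1 >= max_depth:  # predetermined depth reached
            return None
        for nxt in graph.get(state, []):
            if nxt not in path:  # do not loop back along the current path
                found = visit(path + [nxt])
                if found:
                    return found
        return None
    return visit([start])


# Hypothetical state space: each state lists the states its operators produce.
GRAPH = {"s0": ["s1", "s2"], "s1": ["s3"], "s2": ["G"], "s3": ["G"]}
at_goal = lambda state: state == "G"
```

On this toy graph the breadth first search returns the shorter path through `s2`, while the depth first search with a bound of 3 commits to the first branch and reaches the goal through `s1` and `s3`.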

Heuristic search methods, on the other hand, attempt to expand promising
alternatives first and do not necessarily try out all possibilities. Consider,
for example, the crypto-arithmetic task

      DONALD
    + GERALD
    --------
      ROBERT

in which the task is to assign digits to the letters so that the two resulting
addends sum to the third numeral (see Bartlett, 1958 or Newell & Simon, 1972).



INSERT FIGURE 2 ABOUT HERE 

An exhaustive search of the space might move (in a depth first manner)
until each letter has been assigned a value. These assignments then would be
checked to see: (a) if the letters are paired with the digits in a one-to-one
manner, and (b) if the assignments satisfy the indicated addition requirement.
A more heuristic method, suggestive of human behavior, would be to check the
one-to-one and addition requirements as each new digit value is assigned. For
example, once 4 and 5 are assigned to T and N, respectively, 4 and 5 are no
longer valid candidates for L.
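
The heuristic variant can be sketched as a depth first assignment that checks the one-to-one and addition requirements column by column, as soon as each digit is assigned, rather than after all ten letters have values. The function name `solve_cryptarithm`, its `fixed` argument, and the assumption that all three words have the same length are introduced here for illustration; this is not the procedure of Newell and Simon (1972). The example call supplies D = 5 as a starting hint, which is also an assumption of the sketch.

```python
def solve_cryptarithm(a, b, total, fixed=None):
    """Assign digits so that a + b == total, pruning as each column is filled.

    Assumes the three words have equal length (true of DONALD/GERALD/ROBERT).
    """
    n = len(total)
    assign = dict(fixed or {})
    leading = {a[0], b[0], total[0]}  # leading letters may not be zero

    def candidates(letter):
        """Digits still open to `letter` under the one-to-one requirement."""
        if letter in assign:
            return [assign[letter]]
        used = set(assign.values())
        return [d for d in range(10)
                if d not in used and not (d == 0 and letter in leading)]

    def column(i, carry):
        """Fill column i (counted from the right) given the incoming carry."""
        if i == n:
            return dict(assign) if carry == 0 else None
        x, y, z = a[-1 - i], b[-1 - i], total[-1 - i]
        for dx in candidates(x):
            new_x = x not in assign
            assign[x] = dx
            for dy in candidates(y):
                new_y = y not in assign
                assign[y] = dy
                s = dx + dy + carry
                # Addition requirement checked immediately for this column.
                if s % 10 in candidates(z):
                    new_z = z not in assign
                    assign[z] = s % 10
                    found = column(i + 1, s // 10)
                    if found is not None:
                        return found
                    if new_z:
                        del assign[z]
                if new_y:
                    del assign[y]
            if new_x:
                del assign[x]
        return None

    return column(0, 0)


solution = solve_cryptarithm("DONALD", "GERALD", "ROBERT", fixed={"D": 5})
```

Because a conflicting digit is rejected at the column where it arises, most of the exhaustive space is never generated.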
Many refinements of state space representations and search methods have
been proposed, of course, but the essentials remain as described: the possible 
states and operators are represented in terms of a relational net (state-space) 
and search methods are devised for finding paths between given states. 

Learning (or storing information) in this view involves transforming 
given state spaces into new ones. This may take the form of actually constructing 
a new space or, as we shall see, tagging or in some other way distinguishing 
certain states and operators in the given space. 

Not surprisingly, a wide variety of current models of cognitive behavior,
most particularly in problem solving and memory, are increasingly recognized
as having a good deal in common (e.g., Reitman, 1970). The major difference
seems to be one of terminology. In problem solving, the starting states are
referred to as the "given" information and the goal specifies properties to be
satisfied by a solution (cf. Polya, 1962). In retrieval from memory, the
starting states correspond to (external) recall cues and information which happens
to be active in the processor (short-term memory). The goal refers to
to-be-recalled items.

Throughout our discussion, problem solving plays a distinctly secondary
role and is considered only where this serves to clarify our main argument.

Memory Theories 

Because associative models of memory appear to be giving
way to the information processing view, it is perhaps surprising that both kinds
of models involve state space representations. These models range widely and
deal with the free recall of unorganized nouns (e.g., Bower, 1972), the semantic
structure of memory (e.g., Rumelhart, Lindsay, & Norman, 1972; Kintsch, 1972;
Quillian, 1968; Collins & Quillian, 1972), the structure of paragraphs (e.g.,
Crothers, 1972), implication (e.g., Frederickson, 1972), and patterned sequences
of symbols (e.g., Simon, 1972; Restle, 1970; Glanzer & Clark, 1963; Vitz & Todd).
Anderson's model FRAN (reported in Bower, 1972) provides perhaps the
most clearly defined associative model in this sense. This theory apparently
deals successfully with the free recall of unorganized (non-categorized) nouns.

In FRAN, the initial data base (state space) consists of 262 concepts
(nouns), each having between 3 and 19 associative connections with the others
(determined from Webster's dictionary). The data base in this case may be
thought of as representing the associative connections that the population of
subjects might conceivably have learned. Particular (sub)lists of nouns are
learned in accord with associative principles. On each trial, an attempt is
made to tag (i.e., activate) the presented noun, and pathways emanating from
this noun are searched for other nouns in the list to be learned. Where such
pathways are found, they are marked with a LIST tag. According to Bower, the
effect of such markings is to direct the executive (search method) during
retrieval toward marked pathways leading from given nouns to others to be recalled.
In common with other associative theories, the marking of nodes and pathways
(i.e., learning them) is assumed to be a probabilistic process increasing
linearly with study time per item.

Anderson and Bower assume that between two and four items are held in 
short-term memory (STM), together with newly presented nouns and/or retrieval
signals. In addition, three or fewer items are assumed to reside in a similar 
store called ENTRY SET. ENTRY SET consists essentially of those nouns that are 
connected to the largest number of associates. In effect, a total of about 
seven items are assumed to be "active" at any one time, an assumption which has
become increasingly common in memory theories ever since Miller's (1956) classic 
paper was written. 

Among FRAN's more distinctive features as an associative model is that items
are not retrieved independently but depend on the items initially available and
on successively retrieved items. During recall, the information processor is
assumed to respond immediately with the four or so nouns held in short-term
memory. The short-term memory nouns together with the three on ENTRY SET then
serve as starting nodes from which to commence a search through the associative
network. The executive (search) process examines the associative connections
emanating from these items in a depth first search until a noun is reached from
which no pathways emanate. The search continues only along learned pathways.
Nouns at the ends of learned pathways are recalled.
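
The tagging and retrieval cycle described above can be caricatured in a few lines. This is a toy sketch and not Anderson's program: the five-noun network, the function names, the single `tag_probability` parameter, and the decision to report every noun reached along tagged pathways are all simplifying assumptions, and STM and ENTRY SET are collapsed into one list of starting nouns.

```python
import random


def study(network, study_list, tag_probability, rng):
    """Mark (tag) pathways that lead from one studied noun to another."""
    tagged = set()
    for noun in study_list:
        for associate in network.get(noun, []):
            # Tagging is probabilistic; the probability stands in for study time.
            if associate in study_list and rng.random() < tag_probability:
                tagged.add((noun, associate))
    return tagged


def recall(network, tagged, starting_nouns):
    """Depth first search from the starting nouns along tagged pathways only."""
    recalled = []

    def follow(noun):
        if noun in recalled:
            return
        recalled.append(noun)
        for associate in network.get(noun, []):
            if (noun, associate) in tagged:
                follow(associate)

    for noun in starting_nouns:
        follow(noun)
    return recalled


# Hypothetical associative network (noun -> associates).
NETWORK = {"dog": ["cat", "bone"], "cat": ["mouse"], "bone": [],
           "mouse": ["cheese"], "cheese": []}
tags = study(NETWORK, ["dog", "cat", "mouse"], 1.0, random.Random(0))
```

With every eligible pathway tagged, `recall(NETWORK, tags, ["dog"])` follows dog to cat to mouse and stops, since the pathways to "bone" and "cheese" were never marked. The point of the sketch is only that learning changes tags on a fixed network, while retrieval remains a search of that network.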

Although they generally give greater attention to semantic and categori- 
cal features, existing information processing models are also based on state 
space methods. The model by Rumelhart, Lindsay, and Norman (1972) illustrates 
this class as well as any. Here again, the data base is a state space (rela- 
tional net) and retrieval is like running a maze from various starting points 
to others. 

In the Rumelhart et al. model, however, unlike the Anderson-Bower model,
no formal distinction is made between the data base and processes which operate
on that base. More immediately relevant here, the nodes in the state space
consist of concepts (e.g., bird) and actions (e.g., roll) connected by relations.
Although Rumelhart et al. are not explicit on the point, concepts may be viewed
as classes, or equivalently as properties of items which define classes. Such
properties are determined by encoding by insertion into classes (for details,
see Scandura, 1971). The processes (in the data base) may serve to retrieve
information in the data base or to modify the data base through learning. These
processes operate under the constraint of a fixed STM capacity.

The model also includes an executive interpretive process which encodes
information directly into the data base. The executive, together with certain
other unspecified primitive routines, are viewed as necessary features of a
workable simulation system which are "defined outside of the memory structure
itself /Rumelhart et al., 1972, p. 210/."

Among the more significant features of the model are: (a) the possibility
of defining secondary nodes (e.g., small bat) in terms of primary ones (i.e.,
small and bat), (b) a taxonomy of rules of formation for (new) relations,
concepts, propositions (i.e., concepts which express relationships among concepts),
and operators, and (c) explicit processes for forming general concepts from a set of
examples and for subdividing concepts (e.g., birds that do and do not fly).

Rumelhart, Lindsay, and Norman (1972) feel that three characteristics 
most distinguish their model from others of the semantic processing variety. 
First, rather than tagging new items as in the Anderson-Bower model, for example, 
the interpreter constructs a list of properties (features) of the items. A 
general feature of the interpreter is that when STM reaches capacity, an attempt 
is made to reorganize its contents into higher level categories and thereby 
reduce the memory load. Second, retrieval is viewed as reconstruction of items 
from remembered characteristics in STM, rather than as searching for connections 
between items. Although this distinction is important conceptually, it should 
be emphasized that it is a direct implication of defining the nodes in the state 
space in terms of properties (classes). Locating to-be-recalled items still 
involves searching through a state space. Failure to retrieve an item in the 
Rumelhart et al. (1972) view, results when not enough characteristics of the item 
have been stored. Third, retrieval is thought to be directed according to
explicit heuristic criteria, rather than being relatively non-selective, as with
the undirected depth first search procedures used in FRAN, for example.

To summarize, a broad range of memory theories conceive of long term 
memory (LTM) as represented by a state space. Storage, or learning, involves 
either tagging items in a relational net, or constructing properties of items,
which amounts formally to essentially the same thing since both involve
transforming one relational net into a new one. During retrieval in most such models,
search begins with the items or properties in STM. From there, a directed or 
undirected search is initiated until the to-be-recalled item is found, or 
failure results. At a formal level, most information processing accounts of 
problem solving have the same general form. In this case, the goal is to find 
a solution path from the given to a problem solution. 

The psychological reality which state space theorists impute to their 
constructs is well summarized by Newell and Simon (1972):

Human problem solving, we have argued, is to be understood by 
describing the task environment in which it takes place; the space 
the problem solver uses to represent the environment, the task, 
and the knowledge about it that he gradually accumulates; and 
the program the problem solver assembles for approaching the 
task /pp. 867-868/. 

Limitations 

Unfortunately, state space formulations (including Newell & Simon's use
of production systems to represent search methods) have a number of important
and fundamental limitations. Perhaps the most basic are those pertaining to
individual differences in the formation of state spaces, and learning. Again
quoting Newell and Simon (1972):

Our emphasis has been on the problem solver's performance 
program ... We brought to bear what evidence we could on the 
question of how the problem solver, in the face of a new task, 
generates an appropriate problem space and program and on 
the commonalities and differences among problem solvers. 
Our answers to these questions were sketchy, for these areas
undoubtedly represent the largest and most important terra
incognita on the map of the theory of human problem solving today
/pp. 867-868/.

Although the importance of individual differences is well recognized, 
existing state space theories have little more to say about them than the fact 
that state spaces and processes may vary over individuals.

In dealing with individual differences, the state space theorist is
faced with a dilemma. On the one hand, he may employ a separate state space
for each subject together with individual processes characteristic of that
subject. Such an approach, however, would be antithetical to science. Piaget,
for one, has recognized this problem and it is primarily for this reason (e.g., 
see Furth, 1969) that he chose to deal with the epistemic subject, rather than 
the individual. 

The alternative is to set up one state space to account for the behavior 
of all subjects (or, at least, for a given class of subjects), together with 
a fixed set of processes. In this case, however, the result will necessarily
be a theory of averages. Such theories may provide convenient ways of explaining
and perhaps predicting average performance of groups of individuals, but they cannot
seriously be used to characterize individual processes (Scandura, 1971). Any 
viable memory theory that purports to deal with individual differences must 
distinguish between those characteristics which are common to all people and 
those which make them unique. 

Existing theories not only fail to deal with individual differences in 
a substantive way but they tend to be geared to particular task environments. 
The model described by Bower (1972) deals with the free recall of unorganized 
lists while that of Rumelhart et al. (1972) was explicitly designed to deal with
verbal organization. Both models probably reflect human memory of verbal material
to some degree since people can obviously deal with both kinds of situation.
Yet neither model by itself allows for this. In FRAN items are treated as wholes,
at the same level of abstraction. By stressing properties of items, Rumelhart
et al. get a somewhat more general state space but at the expense of more
processing rationality (e.g., in forming general concepts) than is reasonable or
necessary in many situations.

In state space formulations, it is also unclear what are the mechanisms
by which state spaces are constructed in the first place. The executive
interpretive system of Rumelhart et al. (1972) was designed for this purpose, but
if it is so important (as it is), why was it kept separate from the memory theory
itself? Equally important, the processes by which state spaces are modified
have an ad hoc character and are also treated independently. Clearly, there
are relationships between understanding, storing, learning, and searching for
information. Exactly how are these processes related? What are the differences?
What do they have in common? With the exception of a fixed processing capacity
assumption, state space theories are strangely silent on these matters.

The Structural Learning Theory 
Introduction: Competence and the Idealized Theory 
With these questions in mind, let us briefly review the structural
learning theory (Scandura, 1973) as it pertains to cognition generally, and memory
in particular.

The structural learning theory consists of three interrelated partial 
theories, each of which must be tested empirically in a different way. First, 
there is a theory of structured knowledge - or, more accurately as we shall see,
theories of structured competence. These theories deal with the problem of how
to characterize competence: the competence associated with particular behavior
constitutes a theory in its own right. The second partial theory brings the
behaving subject into the picture. It provides a basis (1) for determining the
knowledge had by particular subjects (relative to a given theory of competence)
and (2) for telling how that knowledge is selected for use and how new knowledge
is acquired. This theory is an idealization in the sense that it applies only
where the subject is unencumbered by memory and his finite capacity for
processing information. The third theory is still more general and tells what happens
when memory and information processing capacity are taken into account. These
three theories build upon one another in a natural way, although research on 
any one can progress independently of the others. 

The observer and the subject both play a fundamental role in the theory, 
corresponding to the above distinction between competence and knowledge. Com- 
petence involves rules introduced by an observer to account for behavior he is 
interested in observing. This behavior, or more exactly, this class of poten- 
tially observable input-output pairs, against which actual behavior is to be 
judged, is predetermined. When the psychologist enters his laboratory, for
example, he has a pretty good idea ahead of time what stimuli and what responses 
he is interested in. Whether or not the subject wiggles in his chair as he 
elicits the response "^tUR" may not only be unanticipated, but typically will 
also be ignored. Similarly, in testing students to see whether they know the 
subject matter, the professor can usually determine in advance what are the 
stimuli and the corresponding acceptable responses. 

More important, the rule sets introduced by the observer to represent 
competence differ in an important way from standard competence theories in lin- 
guistics (e.g., Chomsky & Miller, 1963), and indeed, from the formal mathematical 
(production) systems (e.g., see Nelson, 1968) on which they are based. 

A simple grammar, for example, consists of a finite set of rules, and is 
said to account for an input-output pair if some sequence of rules in the rule 
set can be found such that the successive application of these rules to the input 
generates the output. This latter point is particularly important because it 
is implicitly assumed that the rules must be combined in a very special way. 
In the structural learning theory rules are allowed to interact in a more general 
manner. 
To see the difference, suppose, for example, that the given class of
input-output pairs of interest consists of strings of the form $xB \to By$, where
$x$ is a string of a's and $y$ is the binary numeral representing the number of a's
(e.g., $aaaaaB \to B101$, $aaB \to B10$). A simple grammar which accounts for this
class includes the rules $r_1 = xxBy \to xB0y$ and $r_2 = xxaBy \to xB1y$. To account
for the pair $aaaaaB \to B101$, then, we see that

$$aaaaaB \xrightarrow{r_2} aaB1 \xrightarrow{r_1} aB01 \xrightarrow{r_2} B101.$$

Notice that neither of the given rules is sufficient in itself to account for
the given pair. It is necessary to assume that the rules may be applied
successively as many times as desired.

An equivalent way of accounting for this class is to explicitly include
a generalized composition rule $\circ$ in the characterizing rule set, call it
$A = \{ r_1, r_2, \circ \}$. Accounting for a given input-output pair, in this case, means
either that there is a rule in $A$ which generates the output on application to
the input or that such a rule may be derived by application of rules in $A$ to
other rules in $A$. More precisely, we say that $A$ accounts for an input-output
pair if there is a finite number $n$ such that there is a rule in one of the
following sets which generates the output from the input.

$$
\begin{aligned}
A &= \{ r_1, r_2, \circ \} \\
A^2 &= A \cup \{ r_1 \circ r_1,\; r_1 \circ r_2,\; r_2 \circ r_1,\; r_2 \circ r_2 \} \\
A^3 &= A^2 \cup \{ r_1 \circ r_1 \circ r_1,\; r_1 \circ r_1 \circ r_2,\; r_1 \circ r_2 \circ r_1,\; \ldots \} \\
&\;\;\vdots \\
A^n &
\end{aligned}
$$

With respect to the above instance $aaaaaB \to B101$, for example, the rule
$r_2 \circ r_1 \circ r_2 \in A^3$ serves this purpose.
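
The second formulation can be sketched directly, with rules as partial functions on strings and composition as a rule that operates on rules. The function names and the search bound `max_n` are assumptions made for this sketch, as is the restriction that the first rule applies only when at least one pair of a's is present.

```python
from functools import reduce
from itertools import product


def r1(s):
    """xxBy -> xB0y: halve an even, non-empty run of a's and prefix a 0."""
    a_run, sep, numeral = s.partition("B")
    if not sep or set(a_run) != {"a"} or len(a_run) % 2:
        return None  # rule does not apply
    return "a" * (len(a_run) // 2) + "B0" + numeral


def r2(s):
    """xxaBy -> xB1y: halve an odd run of a's (dropping one) and prefix a 1."""
    a_run, sep, numeral = s.partition("B")
    if not sep or set(a_run) != {"a"} or len(a_run) % 2 == 0:
        return None
    return "a" * (len(a_run) // 2) + "B1" + numeral


def compose(first, second):
    """The composition rule: build the rule 'apply first, then second'."""
    def composite(s):
        middle = first(s)
        return None if middle is None else second(middle)
    return composite


def accounts_for(rules, source, target, max_n):
    """Find the smallest n <= max_n with a composite of n rules mapping
    source to target; return n and the rule names in order of application."""
    for n in range(1, max_n + 1):
        for chain in product(rules, repeat=n):
            if reduce(compose, chain)(source) == target:
                return n, [rule.__name__ for rule in chain]
    return None
```

For the instance in the text, `accounts_for([r1, r2], "aaaaaB", "B101", 5)` finds no single rule and no pair of rules that works, and succeeds with three rules applied in the order r2, r1, r2.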
It is important to emphasize that these two formulations are
mathematically equivalent insofar as computing power is concerned, so in one sense we
have nothing new. Mathematical equivalence, however, does not necessarily imply
behavioral equivalence, or even, as I would propose in this case, behavioral
viability. One way to see this is to observe that the composition rule $\circ$ is
just one of any number of different higher order rules that might be included
in a rule set. Such rules can greatly increase the power of a rule set. For
example, the higher order rule

$$r_a \Rightarrow r_b$$

operates on rules involving a's and converts them into corresponding rules
involving b's. Just this one rule doubles the power of the given rule set to
include an equivalent set of input-output pairs where the inputs involve b's
instead of a's. More important, every time a new rule involving a's is added
to the rule set, we automatically get "free," because of this higher order rule,
a corresponding rule involving b's.
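
A higher order rule of this kind can be sketched as a function that takes a rule and returns a rule. The names `count_a` and `a_rule_to_b_rule` are hypothetical, and the translation by character substitution is just one simple way to realize the idea under the assumption that inputs are strings over a single lower-case letter followed by B.

```python
def count_a(s):
    """A rule on a's: xB -> By, with y the binary numeral of the number of a's."""
    if len(s) < 2 or not s.endswith("B") or set(s[:-1]) != {"a"}:
        return None  # rule does not apply
    return "B" + format(len(s) - 1, "b")


def a_rule_to_b_rule(rule_on_a):
    """Higher order rule: convert a rule involving a's into the
    corresponding rule involving b's."""
    def rule_on_b(s):
        if "a" in s:
            return None  # the derived rule is about b's only
        result = rule_on_a(s.replace("b", "a"))
        return None if result is None else result.replace("a", "b")
    return rule_on_b


count_b = a_rule_to_b_rule(count_a)
```

Nothing about b's had to be written by hand: `count_b("bbbbbB")` yields `"B101"` just as `count_a("aaaaaB")` does, and any further rule on a's added later can be passed through the same higher order rule.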

In contrast with competence, the term "knowledge" refers to a potential 
for behavior. Knowledge also consists of rules, but these rules are attributed 
to a behaving subject and are thought of as generating behavior. Previous theories 
(e.g., see Piaget in Furth, 1969; Newell & Simon, 1972), in which rule-like
constructs are attributed directly to behaving subjects, have been essentially
non-operational. The underlying mechanisms have been difficult if not impossible
to test empirically. The Piagetian mechanisms of accommodation and assimilation,
for example, are immune in an important sense to behavioral test because the
effects of these mechanisms on behavior depend on the knowledge individual
subjects have when they enter the learning or testing situation. But Piagetian
theory itself provides no way of finding out what this (individual) knowledge is.

The structural learning theory provides an explicit way of handling
this problem. The rules introduced by an observer to account for the behavior
of interest are used as an instrument of sorts with which to measure
human knowledge. More specifically, the theory tells how, through a finite
testing procedure, one can identify which parts of given rules in a competence
theory individual subjects know - that is, which rules the subjects can perform
in accordance with. The rules in a competence theory in a very real sense serve
as rulers of measurement, and provide a basis for the operational definition of
human knowledge. It should be noted in this regard that to have behavioral
relevance, a rule set must reflect the common culture shared by the population in
question.

To briefly review how this is accomplished (for details, see Scandura, 1973),
we first note that the basic assumption on which the theory rests is that people are
goal directed information processors. Further, rules may be viewed as procedures
in the sense of computer programs and may be characterized, for example, as flow
diagrams or labeled directed graphs (see Scandura, 1973).



INSERT FIGURE 3 ABOUT HERE 

Procedures can always be broken down into simple enough steps so that 
each subject in a given population is able to perform each step perfectly or 
not at all (cf. Suppes, 1969; Scandura, 1973). In short, each component step
of a procedure may be assumed to act in atomic fashion. The behavioral reality 
of atomic rules has been established, in my opinion, beyond any reasonable 
doubt (e.g., see Scandura, 1969). 

Since each component acts in atomic fashion, each path through a
procedure also acts in atomic fashion. That is, each path through a procedure makes
it possible to generate responses to a uniquely specified equivalence class of
stimulus items, and to no others. Furthermore, there are only a finite number
of such paths, since we do not distinguish paths according to the number of
repetitions of loops. Collectively, these paths impose a partition on the
domain of stimuli to which a procedure applies. This makes it possible to
pinpoint through a finite testing procedure exactly what it is that each subject
knows relative to the initial procedure introduced by the observer. It is
sufficient to test the subject on one item selected randomly from each equivalence
class. Success on any one item, according to our assumptions, implies success
on any other item drawn from the same equivalence class, and similarly for failure.
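
The logic of the testing procedure can be sketched with a deliberately small example. Everything concrete here is a hypothetical illustration rather than material from the studies cited: a two-digit subtraction procedure with two paths (with and without borrowing), and a simulated subject who has mastered only one of them.

```python
def path_of(item):
    """Equivalence class of an item: the path it takes through the
    (hypothetical) competence procedure for two-digit subtraction."""
    top, bottom = item
    return "borrow" if top % 10 < bottom % 10 else "no_borrow"


def assess(subject, test_items):
    """Test one item per path and record success or failure for that path."""
    known = {}
    for item in test_items:
        top, bottom = item
        known.setdefault(path_of(item), subject(item) == top - bottom)
    return known


def predict(known, item):
    """Predict success on a new item from the result for its class."""
    return known[path_of(item)]


def subject_without_borrowing(item):
    """Simulated subject who subtracts the smaller digit from the larger
    in every column, and so is correct only on the no-borrow path."""
    top, bottom = item
    tens = top // 10 - bottom // 10
    ones = abs(top % 10 - bottom % 10)
    return tens * 10 + ones


known = assess(subject_without_borrowing, [(47, 23), (42, 27)])
```

Two test items suffice here because the procedure has two paths; `predict(known, (58, 31))` is then success and `predict(known, (53, 38))` is failure, which is what the simulated subject in fact produces.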

Knowledge (behavior potential), then, is also represented in terms of 
rules (procedures), specifically in terms of sub-portions of initial,
corresponding competence procedures. It should be emphasized in this regard that the
knowledge attributed to different individuals may vary even though only one rule
of competence may be involved. The idea is directly comparable to measuring
different distances with the same ruler.

None of this is idle speculation. Scandura & Durnin (1973) and Durnin 
& Scandura (1973) have collected data involving a large number of different 
tasks, with subjects ranging from pre-school children to Ph.D. candidates. 
When run under carefully prescribed laboratory conditions, it was possible to 
predict performance on new items, given performance on initially selected items, 
with over 96% accuracy. When the testing took place under ordinary classroom 
conditions, where the subjects were run as a group, the predictions were accurate 
in about 84% of the cases. 

The structural learning theory also provides a precise set of mechanisms
by which the rules available to a subject are put to use, and by which new rules
are acquired. The basic idea rests on the assumption that human beings are goal
directed information processors, and that control shifts among various higher
and lower level goals automatically in a predetermined manner, according to the
requirements of the situation.

For present purposes, we may think of the mechanism informally as operating
as follows: given a task (stimulus and goal) for which the subject does not
have a solution rule immediately available, control is assumed to automatically
switch to the higher level goal satisfied by rules which do apply. With the
higher level goal in force, the subject presumably selects from among available
and relevant higher order rules in the same way as he would with any other goal.
In effect, if the subject has an applicable rule available, then he will use it.
Where no such higher rules are available, the theory assumes that control moves
to still higher level goals. Conversely, once a higher level goal has been
satisfied, control is assumed to revert to the next lower level.

Assume, for example, that a subject is asked to convert 5 yards into inches, 
but that he does not know explicitly a rule for accomplishing this (e.g., he 
does not know that there are 36 inches in a yard). Let us assume, however, that
he does know rules for converting yards into feet and feet into inches, together
with a higher order rule which operates on pairs of rules such that the output
of one serves as the input of the other and generates composite rules. 

In this case, control would be assumed to shift to the higher level goal 
of finding a solution rule. According to the simple performance hypothesis, 
then, the higher order rule is applied to the yards to feet and feet to inches 
rules, generating a composite rule from yards to inches. This composite rule 
satisfies the higher level goal so control reverts to the original goal. Here 
the simple performance hypothesis is used once again and the composite rule is 
applied to solve the problem.

Again, none of this is idle speculation. Several experiments (Scandura,
1971, 1973) rather conclusively demonstrate the viability of the analysis, at
least under the limited conditions tested. One experiment (Scandura, 1973),
for example, involved the composition higher order rule and simple rules for
trading objects such as toothpicks for erasers. After training on the requisite
simple rules, naive subjects were either trained or not on the higher order rule.
Then, they were presented with new pairs of simple rules and tested on problems
that required corresponding composite rules for their solution. Correct predictions
in this experiment were made in 29 out of 30 individual cases. In a
somewhat more complex and demanding experiment (Scandura, 1973), each subject 
was required to generalize from a specific rule. Correct predictions were made 
in 50 of 50 cases. 
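
The control shifts in the yards-to-inches example above can be sketched in code. The following Python sketch is only an illustration under stated assumptions: the `Rule` class and the `compose`, `find_rule`, and `solve` functions are hypothetical names introduced here, and the search over pairs of rules is a simplification of the general mechanism rather than a statement of it.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Rule:
    """A simple rule mapping a quantity in one unit to another (hypothetical)."""
    source: str
    target: str
    apply: Callable[[float], float]


def compose(first: Rule, second: Rule) -> Optional[Rule]:
    """Higher order rule: build a composite rule when the output of `first`
    serves as the input of `second`; otherwise return None."""
    if first.target != second.source:
        return None
    return Rule(first.source, second.target,
                lambda x: second.apply(first.apply(x)))


def find_rule(rules: List[Rule], source: str, target: str) -> Optional[Rule]:
    """Return an available rule that satisfies the goal, if there is one."""
    for rule in rules:
        if rule.source == source and rule.target == target:
            return rule
    return None


def solve(rules: List[Rule], source: str, target: str,
          value: float) -> Optional[float]:
    # Original goal: use a solution rule that is immediately available.
    rule = find_rule(rules, source, target)
    if rule is None:
        # Control shifts to the higher level goal of finding a solution rule.
        for first in rules:
            for second in rules:
                candidate = compose(first, second)
                if (candidate is not None and candidate.source == source
                        and candidate.target == target):
                    rule = candidate
                    break
            if rule is not None:
                break
    # Control reverts to the original goal: apply the rule if one was found.
    return rule.apply(value) if rule is not None else None


known = [
    Rule("yards", "feet", lambda y: y * 3),
    Rule("feet", "inches", lambda f: f * 12),
]
print(solve(known, "yards", "inches", 5))  # 180
```

In this sketch the subject's knowledge is the list `known`; no yards-to-inches rule is stored, so the answer is obtained only by first deriving the composite rule and then applying it.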

Extension to Memory 
In the idealized theory it is assumed essentially that the subject has 
a single active memory A, consisting of elements (degenerate rules), simple 
rules, and rules which operate on rules. The contents of this memory, including 
new elements which may be generated in the course of a computation, are assumed 
to be readily and uniformly available to the subject. The absence of a priori 
relations among the rules can be represented as in Figure 4A.



INSERT FIGURES 4A & 4B ABOUT HERE

Experiments have shown that this idealization can be approached in practice (e.g.,
Levine, 1966; Scandura, 1973) but, of course, this will not be the case in all
empirical situations. Even familiar information is not always equally easy to
recall; witness the "tip of the tongue" phenomenon. In general, at any given
point in time some knowledge (rules) will be available (aroused) but other
knowledge will not.

With this in mind, Scandura (1973) extended the idealized theory by 
distinguishing between a long term memory (M), consisting of a cumulative
record of all active elements, and that small part of it (A) which is active
at any one time (see Fig. 4B). M is a finite set of rules as before; but only
rules and (encoded) stimuli in A can generate responses or produce new
knowledge. All processing goes on in A.

In developing the theory, Scandura (1973) found it convenient to distinguish
between memory theory where the capacity of A is finite but unbounded
and where the capacity of A is fixed. Clearly, the memory theory with unlimited
processing capacity is more broadly applicable than the idealized (memory free)
theory. In particular, the theory applies in situations where certain rules
are not immediately available in A, even though the subjects may have previously
learned and stored them (in M). In testing the theory, the only essential
condition is that the subject not be hampered by his limited capacity for
processing information. This can be accomplished, for example, by providing the
subject with a pencil and paper, and all the time he needs.

In the theory, stimulation from the environment that enters A automatically 
becomes part of M. This information remains immediately available to the sub- 
ject, however, only as long as it remains in A. It can be retrieved at a later 
time only if it has been stored (via rules) in relation to other information 
which can serve to cue it. Specifically, storing information involves construct- 
ing rules by which to-be-remembered elements can be generated from other elements 
(that are either given as cues or in A). Retrieving information involves using 
active rules to generate observables from given cues and elements in A (or in 
the environment).

The basic mechanisms of the memory theory with unlimited processing
capacity are a direct extension of those for the idealized theory. In retrieval,
for example, control may shift among goal levels as before. The relatively small
number of rules in A, however, serves to keep within strict bounds the number
of rules that must be tested at each stage. Where desired rules cannot be derived
or retrieved solely from rules in A, or in the environment, control shifts
so as to activate (i.e., derive or retrieve) rules which do make this possible.
For example, it is reasonable to assume that some of the rules needed in a
derivation, particularly those on which a given rule might operate, may not be active
(in A). In this case, the mechanism allows control to shift automatically to 
what are called domain goals. Domain goals are satisfied by rules in the domains 
of corresponding available (higher order) rules. Once needed domain information 
is activated, through derivation or retrieval, control returns to the goal from 
which the secondary domain mechanism was initiated, and the process continues. 

To date, only one series of experiments has been run to test the memory 
theory with unlimited capacity. This research was concerned with the behavior
of individual subjects in particular situations, and involved a demanding new 
paradigm in which a major task was to insure that the experimental conditions 
accurately and completely reflected the proposed theoretical requirements. In 
each experiment the overall results strongly supported the theoretical mechanism 
under study. There were some minor perturbations in the data of a few
individuals in the earlier experiments, however, which led to methodological
refinements that were required by the theory but had originally been overlooked.

In Experiment I, the author's assistant made certain modifications in the
procedure that were not caught until after 32 subjects had been run. Since they
appeared to be minor and could very easily be made by anyone running such an
experiment for the first time, it is instructive to consider this experiment
in detail.

Experiment I 

Method 

Materials. The experimental material was similar to that used in an 
experiment by Scandura and Ackler (reported in Scandura, 1973). These materials 
consisted of sets of small items such as paper clips and rubber bands which were 
used in making trades with the experimenter. In addition, there were cards
each of which described a rule for trading n stimulus objects for n + m or
n - m response objects. On the back of each card was a symbol designated as
the "name" of the card. There was also a chart which could be used for locating
rule cards by name for making specific trades.

The cards were used to designate two kinds of rules, simple and composite. 
Simple rules effected trades directly and were represented on 5 x 8 inch cards.
The card at the top of Figure 5 designates a simple rule which maps n paper
clips into n + 1 blue chips.



INSERT FIGURE 5 HERE 



The card at the bottom of Figure 5 designates a composite rule for changing 
pencils into paper clips, and then paper clips into white chips. 

A pair of simple rules is said to be compatible if the output of one matches 
the input of the other. Compatible rules can be combined to form composite rules. 
For example, the rules n pencils → n + 2 paper clips (A→B) and n paper clips →
n + 1 white chips (B→C) can be combined to form a composite rule (A→B→C) which
maps n pencils into n + 3 white chips. The set of compatible simple rules comprises
the domain of a higher order composition rule which maps such pairs into
corresponding composite rules.

The chart was a 9 x 9 table in which the entries were names of rules
for converting row elements (e.g., pencils) into column elements (e.g., paper 
clips). The main diagonal and all entries below and to the left of it were 
blank. Thus, no element could be traded for itself, and no rule had an inverse. 
For example, there was a rule for trading pencils for paper clips but none for 
trading paper clips for pencils. The chart could be used to identify rules 
not immediately available. (This corresponded to retrieval.)
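
The trading rules, the compatibility condition, and the chart lend themselves to a compact illustration. The Python sketch below is offered only as an aid to the reader: the tuple encoding of a rule, the function names, and the rule names "P" and "Q" are assumptions introduced here, and only two of the chart's entries are shown.

```python
from typing import Dict, Optional, Tuple

# A simple trading rule encoded as (input object, output object, offset m),
# read as "n input objects -> n + m output objects". The encoding is assumed.
TradeRule = Tuple[str, str, int]


def is_compatible(first: TradeRule, second: TradeRule) -> bool:
    """A pair is compatible if the output of one matches the input of the other."""
    return first[1] == second[0]


def compose(first: TradeRule, second: TradeRule) -> Optional[TradeRule]:
    """Higher order composition rule: A->B and B->C yield a composite A->C."""
    if not is_compatible(first, second):
        return None
    return (first[0], second[1], first[2] + second[2])


def trade(rule: TradeRule, n: int) -> int:
    """Number of response objects given in exchange for n stimulus objects."""
    return n + rule[2]


# Hypothetical fragment of the chart: (row element, column element) -> rule name.
# Only entries above the main diagonal exist, so no rule has an inverse.
chart: Dict[Tuple[str, str], str] = {
    ("pencils", "paper clips"): "P",
    ("paper clips", "white chips"): "Q",
}
rules_by_name: Dict[str, TradeRule] = {
    "P": ("pencils", "paper clips", 2),
    "Q": ("paper clips", "white chips", 1),
}


def lookup(row: str, column: str) -> Optional[str]:
    """Retrieve the name of the rule converting row elements into column elements."""
    return chart.get((row, column))


composite = compose(rules_by_name["P"], rules_by_name["Q"])
print(composite)            # ('pencils', 'white chips', 3)
print(trade(composite, 4))  # 7
print(lookup("paper clips", "pencils"))  # None
```

Because every rule has the form n → n ± m, composing two compatible rules amounts to adding their offsets, which is why the composite of n + 2 and n + 1 maps n pencils into n + 3 white chips.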

Tasks. There were four different kinds of tasks. The first were direct 
trading tasks in which the subject was given a simple or composite rule card and a 
set of stimulus objects. He was asked to make the trade indicated on the card. 

The higher order rule was used to define a second higher order (H) task 
in which the subject was presented with a compatible pair of simple A→B and
B→C rules and a set of stimulus objects A. The goal was to trade the given
A objects for the output (C) of the simple B→C rule (that did not involve the
A objects). This could be accomplished by first deriving the necessary composite 
rule and then applying it. The composite rules (cards) could be derived by 
rearranging the given pairs of compatible simple rule cards to form composite ones. 

In the third, domain (D), task the subject was given a simple B→C rule (e.g.,
n paper clips → n + 1 white chips) and a set of A items not represented on the card
(e.g., pencils). Furthermore, the column on the chart which corresponded to the
rule output C (e.g., white chips) was covered so that it was not possible to name
the rule which converted pencils directly into white chips. The goal was to
find a pair of compatible rules (including the given rule) in the domain of the
higher order composition rule. The inputs of the derived simple rule had to
match the given A objects (e.g., pencils), and the outputs had to match the inputs (B)
of the given card (e.g., paper clips). For example, given pencils and the rule
n paper clips → n + 1 white chips, the subject had to use the chart to find the
rule which traded pencils for paper clips. The (domain) rule for accomplishing 
this involved locating the row and column of the table corresponding respectively 
to the domain and range of the desired rule. The entry in each row and column 
was the name of the indicated rule. 

The fourth task was a composite HD task. The stimulus situation was 
identical to that in the domain task, but the goal, for example, was to trade 
pencils (A) for white chips (C). This task could be solved by identifying the
composition higher order rule as adequate, retrieving the needed A→B domain rule
from the table, applying the higher order rule to the A→B and B→C rules to form
the needed composite rule, and finally applying the composite rule to solve the 
problem. 

Subjects, design, and procedures. The subjects were 32 elementary
school children in second through sixth grades at the Belmont Hills and Lea
Elementary Schools in Lower Merion, Pa., and West Philadelphia, respectively.
The experiment was conducted with individual subjects in two separate sessions, 
usually a day apart. At the end of each session the children were offered a 
balloon. 

The first session consisted of pretraining and a Transfer Pretest. The 
subject was told he was going to play a trading game with the experimenter. He 
was then taught how to interpret the rule cards and to make trades using the 
rules represented by these cards. The experimenter pointed out that each rule 
card had a name printed on the back and that the names of all of the rules were 
on the chart, but he did not indicate how the chart was used. The subject was 
shown a rule and told, for example, "This is rule M; it is on our chart. Rule 
M tells us, no matter how many paper clips I give you, you must give me the 
same number of blue chips plus one."

The experimenter initiated a number of trades requiring use of the
simple rule, providing assistance where necessary until the subject reached
a criterion of three consecutive successful trades. The experimenter then gave
the subject a set of objects not in the domain of the rule and asked the subject,
"Can you use this rule to trade these pipe cleaners [for example] for blue chips?"
Regardless of the subject's response, the experimenter emphasized that the rule
could be used only to trade paper clips for blue chips.

The experimenter then showed the subject a different rule card and 
asked him to interpret the rule, providing assistance if necessary. Before
moving to the next card, the subject was required to use the rule to make three 
successful trades. This process continued until the subject reached a criterion 
of three successful trades without assistance, using three different, consecu- 
tive rule cards. At this point, it was assumed that when presented with a simple 
rule card, the subject could apply the corresponding rule. 

Next the subject was taught in a similar manner how to interpret and 
use the composite rules. In the case of the composite rule shown in Figure 5, 
the subject was told, "Here is a rule for trading pencils for white chips. This
rule says that no matter how many pencils I give you, you must give me the same 
number of paper clips plus two. Then no matter how many paper clips you have, 
you give me the same number of white chips plus one." As before, the subject 
was required to perform three consecutive correct trades with each composite 
rule. Training continued until the subject correctly interpreted three different 
consecutive composite rule cards. Then counter-examples were given. The subject
was shown a composite rule and given a set of stimulus objects not in the
domain or was given appropriate stimulus objects and asked to trade for objects
not in the range. Throughout the pretraining, the subjects were always told
when they were right.

The subject was then given a Transfer Pretest consisting of three tasks:
a higher order (H) task, a domain (D) task, and a composite (HD) task. The order
of testing was H, D, HD.

On the H task, the subject first was presented with cards representing 
a pair of compatible rules. The subject had not seen either rule before, but it
was assumed, by virtue of his earlier training, that he knew what the cards meant.
The subject was told that he could use the simple rule cards and that he might
"move" them but he was not shown how to do so. Then the subject was asked to make
three trades requiring the use of the corresponding composite rule. (He was never
shown this rule directly either before or after testing.) For example, a subject
who was presented with the (A→B, B→C) rules "n loose leaf reinforcers → n + 3
paper clips," and "n paper clips → n + 1 gummed labels" would be presented in
turn with various numbers of reinforcers (e.g., A = 6, 8, and 5) and asked for
the appropriate numbers of labels (C). If he made three successful consecutive
trades he was rated competent. If he failed on any one presentation he was rated
incompetent and the task was not repeated. However, if the subject clearly applied
the higher order rule but made an error in counting, the experimenter warned the 
subject to be very careful, and presented the task again. 

On the D task, the subject was presented with a card representing a simple 
B→C rule (e.g., "n paper fasteners → n + 4 rubber bands") and a set of stimulus
A objects not in its domain (e.g., pipe cleaners). The C (i.e., rubber bands)
column of the table was covered so it was not possible to find the rule converting
A objects to C objects (i.e., pipe cleaners to rubber bands) directly. The subject
was told, "I want to trade pipe cleaners for rubber bands. We need a pair of
rules to let us do that. One of them is going to be this rule. Can you tell me
what other rule I need so I can trade the pipe cleaners for the rubber bands?"
The experimenter also emphasized that the subject could use the chart (but no
training was given on it). If the subject responded correctly, the stimulus object
was changed, the rule remained the same, and the task was repeated. If
the subject responded correctly three times in a row he was rated competent.
Otherwise he was rated incompetent on the D task.

Finally, the subject was tested on the HD task. He was reminded first of
what rules he knew or had learned up to that point. For example, if the subject
had learned the H and D rules, the experimenter might say, "In this problem you
may use all the rules you have learned. You have learned to make trades. You 
have learned how to rearrange the cards to make a rule. You have learned how to 
use the chart to find a rule you need. If you need a rule you don't have, you can 
ask me for it and I will give it to you." These reminders were repeated during 
the testing if necessary. 

Then the subject was given a simple rule and a set of stimulus objects not 
in its domain, and asked to trade the stimulus objects for the output of the given 
rule. In order to succeed, the subject had to ask for the necessary card, combine 
it (perhaps mentally) with the given one, and make the trade. (During testing 
some subjects misdefined the problem and tried to trade the stimulus A objects 
for the C output of the given rule by using the given B→C card. When this happened,
the experimenter drew the subject's attention to the fact that the stimulus
objects were not in the domain of the given rule. If the subject asked the experi- 
menter for the wrong rule, he was allowed to choose again, if he wished.) The 
criterion for competence on the HD task was three correct trades. No reinforcement
was given during the Pretest or the Posttests.

The 24 subjects who had failed any of the Pretest tasks participated in
the second session during which training was given on the higher order composition
H task and the domain D task. Twenty-two of the subjects (H-D group) were trained
first on the H task and then given Transfer Posttest I which was identical to the
Pretest. Next, they were trained on the domain task (i.e., on how to use the
chart) and given Transfer Posttest II. Later two subjects (D-H group) were trained
first on the domain rule. On the Posttests, only the simple rules and stimulus
items were changed. All subjects were given both H and D training, even if they
were already competent on a corresponding H or D task. No subject was trained on
the HD task. 

In training on the higher order (H) task, the subject was shown a pair of 
compatible (A→B, B→C) rules and a set of stimulus A objects. The experimenter
demonstrated how the rules could be combined by sliding the simple rules together
in the appropriate manner. The subject was then asked to interpret the newly formed
A→B→C rule. The A→B and B→C rules were separated again and the subject was given
a new set of stimulus A objects and asked to actually trade for C objects. This
was repeated until the subject had successfully performed three consecutive trades
with the given rule pair. Then new pairs of rules were introduced until the subject
made three successful trades with three consecutive, different pairs in a row.
Counter examples were then given. Sometimes the subject was given a compatible 
pair of rules and asked to trade an element not in either domain, or to produce an 
element not in the range of a rule. Or, the pair presented was not compatible and 
the subject had to indicate that two rules could only be combined if the output of 
one matched the input of the other. In this case, the experimenter emphasized 
that the higher order rule only applied to pairs of rules. In actually solving 
the problems the subjects were not forced to slide the simple rule cards together 
if they preferred not to.

In the domain training, the subject was given a simple B→C rule, for example,
"n paper clips → n + 2 white chips" and a set of stimulus A objects not in its domain
(e.g., pencils). The C (i.e., white chips) column in the table was covered.
The subject was told, "I want to trade pencils (A) for white chips (C). We can't
do it just with this card so we need a pair of cards. One of them will be the
(B→C) card we have here [pointing]. Now we're going to see how we can use the
chart to find the other (A→B) card." The subject was then taught how to find the
(A→B) rule for trading pencils for paper clips. After a subject retrieved an A→B
rule by using the chart, the experimenter held the rule against the given B→C rule
so that the subject could see that the output of the former matched the input of
the latter. Similar tasks involving different simple rules and stimulus objects
were presented until the subject was successful at finding the missing rule on
three consecutive tasks. The experimenter always emphasized that a pair of rules
was necessary to solve the problem, even though one of the rules was given. The
question was always phrased, "What pair of rules do I need . . . ?"

Results and Discussion 

Of the 32 subjects given the Transfer Pretest, eight were successful
on the H, D, and HD tasks. Ten of the remaining 24 failed on all three Transfer
Pretests; the other 14 succeeded only on the H Pretest.

After training on the H task, all of the first 22 (of the 24) subjects 
succeeded on the H task on Transfer Posttest I. Except for one of the 22 subjects
who also succeeded on both the D and HD tasks, they all failed on the D
and HD tasks. After subsequent training on the D task, all 22 subjects not only
succeeded on the H and D tasks on Transfer Posttest II but they also succeeded
on task HD on Transfer Posttest II.

Of the two remaining subjects (in the D-H training group), one failed 
on all three Transfer Pretests and the other succeeded only on the H pretest. 
Both subjects succeeded on all three tasks on Transfer Posttest I, after D train- 
ing, and did so again after the subsequent H training. These results are sum- 
marized in Table 1. In the table, indicates that the subject reached cri- 
terion and that he did not. Subjects are identified according to age (8, 9, 
10, 11, 12) and sex (B, G).



INSERT TABLE 1 ABOUT HERE 

With the exception of two of 48 posttests, all of these results are 
consistent with the theory. Before H and D training, there was no basis for
predicting performance on the H and D tasks because there was no way of knowing
whether or not the subject had already mastered the requisite rules. The only
restriction on the Pretest is that a subject who knows both of the higher
order H and D rules should, according to the theory, not only be successful on
the H and D tasks, but also on the HD task. The data support this prediction
in eight of eight cases.

The same pattern was obtained after the H and D training. In only one of
22 cases did H training lead to success on D Posttest I, and here the subject
was also successful on the HD task as would be expected. Although only two
subjects were given the D training first, the data suggest an overlap between
D training and performance on the H task (as well as on the D task). As before,
as well as throughout Transfer Posttest II, success on the H and D tasks was
followed by success on the HD task.

All in all, one might be tempted to report strong support for the proposed 
mechanism. This would be inappropriate, however, because the H training
inadvertently was not restricted to the higher order composition rule. The
subjects were not only taught how to generate composite rules from pairs of
compatible simple rules, but also to use the derived composite rules to solve
H tasks. In effect, they were taught both the H rule and the (idealized)
mechanism itself, thereby leaving unanswered the question of whether the
mechanism itself is innate. This in itself, however, was not a serious problem
since the innateness of the idealized mechanism had been tested previously
(Scandura, 1973).

The problem came in interpreting performance on the HD task. Our original
intent was to determine whether training on a B→C rule, the H rule, and the D
rule was sufficient for solving an A→C problem. Assuming that the subject is
capable of evaluating the lower and higher level goals involved (see Scandura,
1973, Ch. 9 - especially pp. 287 & 294), the postulated (enriched) mechanism
is sufficient for this purpose. Given a set of A objects and the goal of finding
the appropriate number of C objects to trade, control would be assumed to go to
the higher level goal consisting of rules which apply to A objects and generate
C objects. The only available rule for accomplishing this is the higher order H
rule. But, the H rule only operates on pairs of compatible simple rules. Hence,
control is assumed to go to the domain goal consisting of such pairs. In this
case, the only adequate and available rule is the higher order D rule (which applies
in situations where a B→C rule and A objects are available). Since the necessary
domain elements are available, the rule is applied and a compatible pair of A→B,
B→C rules is generated. This pair satisfies the domain goal so control goes to the
original higher level goal. This time the H rule is applied and a composite A→B→C
is generated. Since the higher level goal is satisfied control goes to the original
goal, the composite A→B→C rule is applied and the problem is solved.
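
The sequence of control shifts just described for the HD task can be traced step by step. The Python sketch below is a simplified illustration, not the theory's formal statement: the tuple encoding of rules, the functions `d_rule`, `h_rule`, and `solve_hd`, and the decision to store rules directly in the chart (rather than rule names) are all assumptions made here for brevity.

```python
from typing import Dict, List, Optional, Tuple

# (input object, output object, offset m), read as "n inputs -> n + m outputs".
TradeRule = Tuple[str, str, int]
Chart = Dict[Tuple[str, str], TradeRule]


def d_rule(chart: Chart, a_objects: str, given: TradeRule) -> Optional[TradeRule]:
    """Domain (D) rule: find the A->B rule whose inputs match the given
    A objects and whose outputs match the inputs of the given B->C rule."""
    return chart.get((a_objects, given[0]))


def h_rule(first: TradeRule, second: TradeRule) -> Optional[TradeRule]:
    """Composition (H) rule: operates only on compatible pairs of simple rules."""
    if first[1] != second[0]:
        return None
    return (first[0], second[1], first[2] + second[2])


def solve_hd(chart: Chart, a_objects: str, given: TradeRule,
             n: int) -> Tuple[Optional[int], List[str]]:
    trace = ["original goal: trade A for C"]
    if given[0] == a_objects:
        # The given rule already applies, so no shift of control is needed.
        return n + given[2], trace
    # No available rule applies, so control shifts to the higher level goal.
    trace.append("higher level goal: find an A->C rule")
    # The H rule needs a compatible pair, so control shifts to the domain goal.
    trace.append("domain goal: find a compatible pair")
    first = d_rule(chart, a_objects, given)
    if first is None:
        return None, trace
    # Domain goal satisfied; control returns to the higher level goal.
    trace.append("higher level goal: compose the pair")
    composite = h_rule(first, given)
    if composite is None:
        return None, trace
    # Higher level goal satisfied; control reverts to the original goal.
    trace.append("original goal: apply the composite rule")
    return n + composite[2], trace


chart: Chart = {("pencils", "paper clips"): ("pencils", "paper clips", 2)}
given_rule: TradeRule = ("paper clips", "white chips", 1)
result, trace = solve_hd(chart, "pencils", given_rule, 4)
print(result)  # 7
for step in trace:
    print(step)
```

The trace makes explicit that the D rule is applied under the domain goal and the H rule under the higher level goal, with the composite rule applied only after control has reverted to the original goal.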

Unfortunately, this is not the only reasonable way to account for the HD task 
results. Because of the nature of the H training, it is just as reasonable, per- 
haps more so, to assume that the HD task was solved by composing the rules actually 
taught during the D and H training. To see this, notice that application of the 
D-rule taught during D training generates the needed A→B, B→C pair. Subsequent
application of the combined H-rule and "mechanism" taught during H training, then,
not only generates the composite A→B→C rule but also applies it to solve the problem.
(It is important to recognize in this regard that generating an A→B→C rule
followed by its use is not equivalent to a composition of rules.) Thus, assuming 
that a subject who has been given the H training also knows the composition H-rule 
(separate from the mechanism), which seems reasonable with the subjects used, 
success on the HD task can be explained as follows: After control goes to the 
higher level goal, the composition H rule is applied to the D rule and the combined 
H rule and mechanism forming their composition. (Notice that the two rules are 
compatible since the output of the former serves as input for the latter.) The
resulting composite rule satisfies the higher level goal so control reverts to the
original goal, the resulting composite rule is applied, and the task solved. 

Experiment II 

Experiment II was designed to eliminate the second interpretation as a viable 
alternative. 
Method 

The materials, tasks, and procedures in Experiment II were identical to those
of Experiment I except in training on the higher order rule and in stating the
subject's goal in the domain rule training.

Instead of training on the higher order transfer task itself (i.e., training 
on a rule for solving such tasks) , higher order rule training in Experiment II was 
limited to the higher order H rule for forming composite rules from compatible 
pairs of simple rules. During training, the subject was first shown two compatible
simple A→B, B→C rules and a set of stimulus A objects, corresponding to the inputs
of the A→B rule. His goal was to find an A→B→C rule for trading the given A objects
for C objects. The experimenter demonstrated how the simple rules could be combined
by sliding them together in the appropriate manner. Then, the rules were
separated and the subject was given a new set of stimulus A objects and asked to 
construct a composite rule. The subject did not perform any trades with the rule, 
as in Experiment I. This was repeated with other pairs of simple rules until the 
subject was able to form appropriate composite rules when this was possible, or to 
identify the simple rules as incompatible. In addition, the subject was sometimes 
given a pair of compatible A'→B', B'→C' rules where A' ≠ A and C' ≠ C, and, if necessary,
instructed why the corresponding A'→B'→C' rule could not be used to trade A
objects for C objects. 

On the domain task and training, the subject was given a simple B→C rule (e.g.,
n gummed labels → n - 2 rubber bands), and a set of A items (e.g., toothpicks) as
before. Also, the C column on the chart which corresponded to the rule output (i.e.,
rubber bands) was covered so that it was not possible to name the rule which converted
A objects directly into C objects.

The way in which the goal was stated, however, was changed. No reference was 
made to finding a pair of compatible rules for making trades. Rather, the subject's
stated task was to find a pair of rules in which the outputs (B) of one (A→B) were
the same as the inputs (B) of the other (B→C) and the inputs (A) of the first were
identical to the given A objects and the outputs (C) of the second were identical
to the outputs of the given B→C rule. One of the two rules, of course, was always
the given rule. (To help insure that the task was understood, this was explicitly
stated only on the Pretests.) For example, given toothpicks and the rule, n gummed
labels → n - 2 rubber bands, the subject had to find the A→B rule which converted
toothpicks into gummed labels, and indicate that the given rule (n gummed labels →
n - 2 rubber bands) was the other rule.

During D training, the subject was shown how to locate needed rules by name 
in the row and column of the table corresponding, respectively, "to the inputs and
outputs of the desired rule." (The words "input" and "output" were explained to
the subject during the pretraining, and the subjects were taught to identify them 
on the cards.) Subjects were taught to identify the given rule as one member of 
the needed pair by giving the name on the back of the rule card. In short, the 
domain instruction involved using the chart to find compatible pairs of rules but 
references to the possible use of the rules in making trades were eliminated. 

Since the results of Experiment I suggested that D rule training may influence 
H test performance, second grade subjects (aged 7-8) were used in the D-H group 
because they would presumably be more sensitive to inadequacies in wording and 
treatments. The first four D-H subjects were trained on the D task as in Experi- 
ment I. The seven other D-H subjects were trained on this task as described above. 
In addition, there was an H-D group consisting of 10 fourth graders (aged 8-10). These
subjects all received the modified H and D training described above.

One of the second graders, a 7 year old boy, was unable to complete the pretraining
successfully, and was not included in the experimental comparison. Five
of the younger subjects (aged 7 or 8) were unable to complete the experiment in 
two sessions. In these cases the experiment was spread out over three or more 
sessions. The length and content of the sessions varied with each subject's 
attention span and rate of progress. A typical subject might participate in four 
one-hour sessions with the first consisting of pretraining, the second of further 
pretraining and the Pretest, the third a review, domain training and Posttest I, 
and the fourth another review, higher order composition training and Posttest II. 

The mean time spent per subject in the H-D training group was two hours 
and thirty-five minutes. The shortest time was one hour and fifty-five minutes; 
the longest, four hours and forty minutes. For the somewhat younger D-H group, 
the mean time per subject was three hours and forty-five minutes. The shortest 
time was two hours and twenty-five minutes; the longest, seven hours and 
fifty-five minutes. 

Results and Discussion 

The results of the H-D subjects closely paralleled those of Experiment I. 
After training on the H and D tasks, all 10 subjects on Posttest II not only 
succeeded on the H and D tasks, but on the HD task as well. Also as expected, 
H training improved performance only with those 2 subjects who failed on the H 
Pretest. In no case did H training transfer to success on the D task. 

In effect, these results tend to discount the alternative explanation 
of the Experiment I data, lending further support to the proposed (enriched) 
theoretical mechanism. 

The results of the younger D-H subjects, however, were less clear. In 2 or 
3 cases (one was a second administration of the experiment to one subject), D 
training led not only to success on the D task, but also on the H task. Closer 
scrutiny of our experimental method during D training and testing indicated one 
possible source of difficulty. In many cases, the experimenter inadvertently 
showed the subjects how the A→B rules, once retrieved, matched the B→C rules. The 
very process of showing how the rules matched effectively amounted to instruction 
on how to form composite rules. It is therefore not surprising that some of the 
subjects were able to solve the H transfer tasks after D training but not before. 
It should be noted that this activity by the experimenter was not prescribed by 
the instructions provided but evolved naturally in the course of attempting to 
explain rather complex ideas to young and generally untalented children. The fact 
that this took place was determined by analysis of the tapes of the experimental 
sessions by the author and the experimental assistant. 

There were also some additional anomalies with these young subjects that were 
observed for the first time. On four occasions (one subject twice), subjects 
succeeded on the first H task on the Pretest or Posttest but failed when the number 
of stimulus inputs to be traded was changed (indicated +?). Why this was so is 
not clear but it appeared likely that it was due to idiosyncratic features of the 
particular rules used and/or the subjects themselves. For example, one seven year 
old boy traded correctly on the first H presentation, but did so without moving 
the given A→B and B→C cards together. On the second presentation the subject 
appeared confused, as if he interpreted the repetition of the problem as an 
indication that he had traded incorrectly on the first presentation. On the second 
presentation he traded the given A objects for C objects using the B→C card, then 
traded the C objects for B objects using the A→B card. The stimulus objects were 
changed several times and on each presentation the subject was admonished to be 
very careful, without positive effect. Also one subject succeeded on both the H 
and D tasks but failed on the HD task on Posttest I, and another subject (the one 
who went through the experiment twice) performed similarly again during a second 
administration of the experiment on the Pretest. 

If these latter results are due to other than idiosyncrasies, they have 
important implications for the structural learning theory with unbounded capacity, 
at least as applied to younger children. It should perhaps be noted in this regard 
that our D-H children were of the educationally underprivileged variety and had 
considerable difficulty in learning the material. Attention at times was also a 
significant problem. These factors suggest that the unlimited capacity assumption 
may not have been realized in some as yet not well understood way. Among the factors 
that may have been operating are: These subjects may be unable to identify 
information even when it is readily available in the environment (i.e., they may lack 
even basic searching skills); another possibility is that the long training required 
may be indicative of the greater load placed on active memory by the higher order 
rules which had to be remembered. That is, the younger subjects may have had to 
remember them in terms of a larger number of chunks than older children, thereby 
exceeding their processing capacity. Nonetheless, rather than attempt to unravel 
this complicated set of results in the present series of experiments, I decided 
to determine first whether it would be possible to further separate D and H training. 

Experiment III 

Method 

The method used in Experiment III was identical to that used in Experiment II, 
except in the domain rule training. The stimulus situation and statement of the 
task were unchanged. After a subject had used the chart to retrieve an A→B domain 
rule, however, the retrieved rule was never held against the given B→C rule. The 
experimenter did, as before, draw the subject's attention to the fact that the 
input of the retrieved rule matched the stimulus A objects, and that the output of 
the retrieved rule matched the input of the given rule. But this was done in a 
manner that did not reveal how to form the composite A→B→C rule. 
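
For readers who want the composite rule spelled out, the following Python sketch shows what forming A→B→C from compatible A→B and B→C rules amounts to. It is a minimal illustration, not the experimental procedure: the `Rule` class, the `compose` function, and the A→B quantity mapping (n → n+1) are assumptions made for the example. Only the B→C mapping (n gummed labels → n-2 rubber bands) is taken from the text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    input_kind: str
    output_kind: str
    convert: Callable[[int], int]   # how many outputs for n inputs


def compose(first: Rule, second: Rule) -> Rule:
    """Higher order composition: the output of `first` must match the
    input of `second` for the two rules to be compatible."""
    if first.output_kind != second.input_kind:
        raise ValueError("rules are not compatible")
    return Rule(
        first.input_kind,
        second.output_kind,
        lambda n: second.convert(first.convert(n)),
    )


# A→B: quantity mapping is assumed for illustration only.
a_to_b = Rule("toothpicks", "gummed labels", lambda n: n + 1)
# B→C: n gummed labels → n-2 rubber bands, as in the example in the text.
b_to_c = Rule("gummed labels", "rubber bands", lambda n: n - 2)

a_to_c = compose(a_to_b, b_to_c)   # composite A→B→C rule
```

Under these assumed mappings, five toothpicks trade for six gummed labels and then for four rubber bands. Composing the rules in the reverse order fails the compatibility check.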

The scoring criteria for the H task also were modified slightly. If a subject 
succeeded on the A→B→C problem but failed on another, he was presented with an 
entirely new problem and allowed to try again. (This apparently helped to avoid the 
ambiguous scoring problem observed in Experiment II.) 

After four + − − subjects (i.e., subjects who solved the H task and failed 
the others on the pretest) and four subjects had been run, a minor change 
was made in the procedure. Since four of the subjects on the HD task attempted 
first to use the given B→C rule to trade A objects for C objects, the subjects 
were reminded just prior to the HD task that the given rule could only be used to 
trade B objects for C objects. (All subjects had received such training earlier.) 

All fifteen subjects were second and third graders from the Lea Elementary 
School between the ages of 7 and 9. All of the subjects were trained on the D 
rule first. Fourteen of the subjects required more than two sessions in order to 
complete the experiment. The average time per subject was four hours and twenty 
minutes. The shortest time taken to complete the experiment was two hours and 
fifteen minutes. The longest time taken was nine hours. 

Results and Discussion 

The results are summarized in Table 3. 



INSERT TABLE 3 ABOUT HERE 



After success on both the H and D tasks, all but one of the + − − subjects 
succeeded uniformly on the HD task. The one exception was a nine year old girl 
who failed the HD task on Posttest I and succeeded on Posttest II only after 
receiving special help. On these tasks, she appeared to guess cards randomly and 
then reject them because they did not "help." On Posttest II when the experimenter 
finally asked, "What would help?", she indicated that an A→C card would (help) 
but that the C column on the chart was covered. When asked, "Can it be anything 
else?" she said she could use an A→B card but did not proceed to look for it. 
After restating the problem, the subject began to guess cards randomly again. She 
was then asked, "What were the things [rule cards] you told me would help?" She 
then looked on the chart for the A→B card and solved the HD problem. Although 
these results can be interpreted in several ways, it is possible that the original 
search for an A→C rule was guided by a prelearned selection rule (Scandura, 1973, 
Ch. 9). Subsequent rejection would tend to have imposed a greater load on working 
memory, perhaps, as the above summary of events suggests, making it difficult for 
the subject to keep in mind the original goal. 

Four of six subjects⁸ also performed as predicted on the HD task. The 

other two subjects, however, failed (only) on the HD task on Posttest II. One, 
a seven year old boy, seemed to have no idea of how to proceed. After a number of 
apparently random guesses, he became discouraged and unable to concentrate. The 
other, an eight year old girl, moved through the pretraining relatively quickly 
but on the HD task refused to investigate any other possibility after finding that 
she could not trade the given A objects directly for C objects. In this sense, 
she seemed like the nine year old girl mentioned above. 

General Discussion 

Overall, these results suggest that the enriched mechanism proposed is a 
common characteristic of all people. Once a subject knew, or had been taught, 
the appropriate higher order H and domain D rules, he typically not only could 
solve the corresponding higher order H (and domain) tasks but he was successful 
on the even more demanding combined HD task as well. Success on this task cannot 
easily be explained in terms of the idealized (memory free) theory but does seem 
compatible with the extended mechanism proposed. 

The latter theory also has intrinsic support. The enriched mechanism is a 
natural extension of that on which the idealized theory is based; the idealized 
theory has been tested more extensively. What goes under the rubric of memory can 
be handled by extending the basic mechanism of the idealized theory only slightly 
to allow for the generation of domain elements. Moreover, these mechanisms provide 
a highly general framework within which a wide variety of disparate phenomena can 
be viewed such as simple performance, problem solving, learning, and motivation, 
not to mention memory (for details, see Scandura, 1973). There is also some 
evidence to suggest that essentially the same theory can be extended to deal with 
perceptual and developmental phenomena as well (Scandura, 1973, Ch. 5). The basic 
mechanisms of the theory appear to be at least as simple as those required for most 
existing memory theories and, yet, the theory potentially has greater generality. 

Although the basic mechanism appears generally compatible with existing 
memory data, a major limitation is that no serious attempt has been made to date⁹ 
to make direct contact with a large body of memory research. Nonetheless, in 
accord with known facts, for example, it follows directly that degree of recall 
should depend on the extent to which the test conditions reinstate the stimulus 
conditions during storage. According to Scandura (1973), however, the distinc- 
tion between goals and stimuli provides a basis for making finer experimental 
distinctions with complex materials than for the most part has been possible to 
date. According to the theoretical mechanism proposed, it also is immediately 
obvious why a rule that has been used in the immediate past is more likely to be 
used than some alternative, even where, as in Einstellung experiments, the 
alternative would otherwise be preferable. Rules in A are applied before rules which 
must be derived from rules in A. Similar general comments can be made concerning 
retroactive inhibition and reminiscence, as the theory clearly provides a basis 
for learning between storage and retrieval. In general, this new learning may 
either interfere with or facilitate retention. 

A major limitation of this research, however, is that it does not take 
into account what might happen in those cases where a subject is unable to use 
provided information in the environment effectively or where his capacity to 
process information is exceeded. 

Fixed Processing Capacity 

The fixed capacity theory follows Miller (1956) in assuming that A 
contains 7±2 "chunks" of information, only here the chunks may be rules as 
well as elements on which rules operate. The mechanisms of the memory theory 
with unlimited processing capacity are limited in this regard in that they can 
serve only to make more rules active. In the fixed capacity theory, mechanisms 
are also needed to explain how information is deactivated. Although little 
relevant data are available at this time, it would appear that there are two 
basic ways in which deactivation might enter the theory: (1) by modifying the 
basic mechanisms so as to allow for deactivation of goals and rules as well as 
their activation, and (2) by modifying the rule notion itself so that elements 
may be "erased" as well as generated. The basic constraint in either case, ac- 
cording to the theory, is the fixed finite capacity of A. 

Roughly speaking, Scandura (1973) proposed the hypothesis that goals are 
also included in A and are deactivated during a learning episode when they be- 
come no longer useful. For example, it was assumed that any initial goal would 
remain in A throughout the course of a derivation because it is always the last 
as well as the first goal in control. Higher level goals, however, are discharged 
from A as control reverts to lower levels. Retaining them in A after this serves 
no critical purpose. 

Specifying how goals become active and are deactivated still leaves open 
what happens when a rule becomes overloaded in the course of a computation. 
In this case, Scandura (1973) rejected such universal assumptions as: The 
element in A that is processed first is always dropped first, and proposed 
instead to add more structure to rules so that they might serve not only to 
activate (generate) new elements during the course of a computation, but also 
to deactivate (erase) others. In effect, it was hypothesized that rules might 
specify not only what is to be done at each stage of a computation but also 
where each generated element is to be located in A. The placement of a new 
element in a given location is assumed to erase present contents, much as is 
the case in abstract automata (Nelson, 1968). 

In short, it was proposed in a computation not only that elements can be 
generated, and thereby added to A, but also that they can be erased in a 
specified manner. A similar principle of erasure was assumed to apply to the shifting 
of control among goal levels. Elements, whether they be simple elements, rules, 
or goals, are assumed to remain in A if and only if they are needed to determine 
either a future output or an operation that must be performed sometime in the 
future. For example, let $A = \{S_0, r_n, r_m, \circ, G\}$, where $S_0$ is a 
stimulus, $r_n$ and $r_m$ are compatible rules, $\circ$ is composition and $G$ is 
the goal. Assume further that $r_m \circ r_n(S_0)$ satisfies $G$ but not 
$r_n(S_0)$ or $r_m(S_0)$. In this case, control shifts to $G^2$ so that now 
$A = \{S_0, r_n, r_m, \circ, G, G^2\}$. If A becomes overloaded at this point, 
something crucial must be erased, but the theory does not specify what. Here, 
$\circ$ is applied to $(r_n, r_m)$ generating $r_m \circ r_n$. This time, however, 
instead of just adding $r_m \circ r_n$ to A, $r_n$ and $r_m$ may be erased, as they 
are no longer needed. Similarly, once control reverts to $G$, $G^2$ is erased, 
leaving "more space" for the application of $r_m \circ r_n$. 
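
The bookkeeping in this example can be traced with a small Python sketch. It is an illustration under assumptions, not the theory's formal statement: the `ActiveMemory` class, the string labels for elements, and the choice to raise an error on overload are all hypothetical. The sketch shows that, with a capacity of six, the derivation succeeds only if the constituent rules are erased when the composite is generated.

```python
class ActiveMemory:
    """A fixed-capacity active store A (sketch; names are hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []

    def add(self, item):
        # The theory does not say what is lost on overload; this sketch
        # simply reports the overload as an error.
        if len(self.items) >= self.capacity:
            raise OverflowError(f"A is overloaded when adding {item!r}")
        self.items.append(item)

    def erase(self, *items):
        for item in items:
            self.items.remove(item)


def derive_composite(capacity, erase_constituents=True):
    """Trace the example: generate the composite rule under a higher goal."""
    A = ActiveMemory(capacity)
    for item in ("S0", "r_n", "r_m", "compose", "G"):
        A.add(item)
    A.add("G2")                      # control shifts to the higher-level goal
    if erase_constituents:
        A.erase("r_n", "r_m")        # no longer needed once composed
    A.add("r_m∘r_n")                 # the generated composite rule
    A.erase("G2")                    # control reverts to the original goal
    return A.items


final_contents = derive_composite(capacity=6)
```

With erasure, the peak load is six elements and the final contents are the stimulus, the composition rule, the goal, and the composite. Calling `derive_composite(6, erase_constituents=False)` overloads A at the moment the composite is added.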

The theory also includes explicit procedures for determining, in an 
analytical manner, the memory load imposed by individual processes (rules) as 
applied to particular instances. Data collected by Voorhies and Scandura 
(some of which is reported in Scandura, 1973) supports the viability of this 
method. These data are consistent with the notion that each subject has a 
fixed finite capacity for processing information. Although processing efficiency 
depends on the rule used - an extension of Miller's (1956) finding, the basic, 
physiologically determined processing capacity remains fixed. 

Incidentally, the theory treats rehearsal as any other procedure. The 
data obtained by Voorhies and Scandura strongly suggest that rehearsal in and 
of itself has no effect on retention. Unless precautions are taken (e.g., 
Scandura, 1973; Dalrymple-Alford, 1967), however, rehearsal provides opportunities 
for chunking and thereby may give the appearance of improving retention. It 
should be noted in this regard that "chunking," so-called, involves processes 
over and above rehearsal itself, and strictly speaking is not the same (rule) 
as pure rehearsal. 

The Structural Theory of Memory and 
State Space Formulations - Contrasted 

There are at least four major differences between the two formulations. 
First, competence in the structural learning theory consists of a finite set of 
discrete rules. There is no structure to the set of rules itself. The 
structure, if it can be called that, is imposed by the fixed manner in which the 
rules are allowed to interact. In state space theories, on the other hand, 
competence corresponds to a highly structured, fixed network. 

Generative grammars provide a convenient way of conceptualizing the re- 
lationship between structural competence and state space formulations. A 
generative grammar, recall, also consists of a set of rules, but these rules 
may interact only in a very special way. Namely, they must be applied in se- 
quence to successive outputs. State spaces provide a convenient way of represent- 
ing the possible ways in which the rules (operators) in a generative grammar may 
be combined. State spaces are not adequate for representing structural compe- 
tence because rules may be combined and otherwise modified in ways quite different 
from simple composition. 
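
The contrast can be made concrete with a short Python sketch. Everything in it is an assumption chosen for illustration: the two toy operators, the depth bound, and the `in_parallel` higher order rule do not come from the source. The first function enumerates a state space by applying operators in sequence to successive outputs. The second combines two rules in a way that is not sequential composition, so the resulting rule corresponds to no path through that space.

```python
from collections import deque


def reachable_states(start, operators, max_depth):
    """Breadth-first enumeration of the states reachable by applying
    operators in sequence to successive outputs (a state space)."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        state, depth = frontier.popleft()
        if depth == max_depth:
            continue
        for op in operators:
            nxt = op(state)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen


def in_parallel(rule_a, rule_b):
    """A hypothetical higher order rule: it takes two rules and returns a
    new rule that applies both to the same input. This is not sequential
    composition, so it is not a path in the state space above."""
    return lambda x: (rule_a(x), rule_b(x))


def double(n):
    return 2 * n


def increment(n):
    return n + 1


states = reachable_states(1, [double, increment], max_depth=2)
both = in_parallel(double, increment)
```

Here `states` contains only numbers reachable by chaining the two operators, while `both(3)` yields a pair, an output of a kind that chaining alone never produces.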

In effect, it would appear that competence in the structural learning 
theory is both more general and more constrained. It is more general because 
of the great variety of higher order rules which are possible. It is more con- 
strained in that the rules are designed to reflect the knowledge had by a given 
culture or population of subjects, rather than to represent all possibilities. 

Unfortunately, the question of how to actually construct a structural com- 
petence theory or a state space has barely been touched, even in formal treatments 
within computer science. In the latter sphere, for example, the selection of a 
state space has important implications for the search effort required to achieve 
goals (or retrieve information). Some progress has been made in the problem of 
description of states and operators (Amarel, 1968) but the processes by 
which "good" state spaces are devised are very poorly understood. Similarly, 
in the structural learning theory, we know that the rules must reflect the 
culture of the population of the subjects in question. Other than general 
statements to this effect, however, relatively little is known about the speci- 
fic relationships which must exist between particular populations and rules. 

Another facet of problem formulation is handled quite differently in the 
two formulations - namely, that of forming sub-goals. To date this question 
seems to have been considered only in state space models, and there only in 
problem solving (see Newell & Simon, 1972). In state space theories, sub-goals 
are represented in the state space itself, by means of what are called AND/OR 
graphs (state spaces) (see Nilsson, 1971). In the structural learning 
formulation, sub-goals are hypothesized to result from the way in which problems are 
interpreted (see Scandura, 1973, p. 348). Presenting a subject with a problem 
statement, for example, is almost universally understood to mean that the 
subject first is to define the problem - interpret the statement (sub-goal one), and 
then to solve it (sub-goal two). Defining the problem may involve generating 
a series of sub-goals, each of which presumably defines a task to be dealt with 
according to the mechanisms described above. Perhaps surprisingly, 
interpretation (assigning meanings) in the structural learning theory seems not to require 
any new mechanisms (see Scandura, 1973, Ch. 7). There is, however, as yet 
relatively little data relating to this hypothesis. 

The second major difference concerns the distinction between competence 
and knowledge in the structural learning theory. This distinction, in which 
the knowledge had by individual subjects is defined in terms of competence and 
the subjects' behavior, has important implications for individual differences. 
In contrast to the finite, systematic testing procedure provided in the structural 
learning theory, the only way individual differences can be treated in existing 
state space theories is by devising separate state spaces and processes for 
different individuals. Indeed, the distinction between the structure of input 
information (competence) and the structure of the subject's knowledge has barely 
been considered (cf. Frederickson, 1972). 

Third, turning to learning, we see in the structural learning theory that 
knowledge acquisition appears to take place according to a simple, very specific 
mechanism in which control shifts among initial and higher level goals in a 
predetermined (fixed) manner, a manner assumed to be characteristic of all people. 
Although we did not attempt to summarize all of his arguments, Scandura (1973) 
has shown how this one mechanism also deals with motivation, storage and 
retrieval from memory, and interpretative processes by which meanings are assigned. 
In state space formulations, on the other hand, learning involves transforming 
given state spaces, represented for example by tagging states and/or operators 
or by adding new elements. In contrast to learning, retrieval involves searching 
through a state space. Motivation has hardly been considered within this framework. 

It may be noted parenthetically that, where only part of the relevant 
knowledge known to a subject is available, a somewhat more general formulation 
is required. Ignoring processing capacity for the moment, the structural learning 
theory allows for retrieval (generation) of needed information, including 
generation of rules (elements) in domains of available rules as well as rules 
themselves. In state space theories, this corresponds to the commonly assumed 
situation where only a selected few of the nodes (states) may serve as starting 
locations. 

Fourth, where processing capacity is fixed, as it is both in the "enriched" 
structural learning theory and in most current information processing theories, 
specific allowance must be made for erasure of elements from active status, as 
well as for the generation (activation) of new elements. In state space 
formulations the processes by which elements are erased and added have a 
probabilistic and/or arbitrary character. In some theories capacity is assumed to 
relate primarily to the state spaces themselves. In the Anderson-Bower (1972) 
theory, for example, admittedly arbitrary characteristics of spaces are used to 
decide which items (old or new) are to remain active when capacity is exceeded. 
In others (e.g., Rumelhart, Lindsay & Norman, 1972; Newell & Simon, 1972) 
capacity relates primarily to processes. Rumelhart et al. (1972), for example, 
assume a fixed mechanism which recodes active information whenever capacity 
is reached. In the structural learning formulation, this would correspond 
to the generation of a new processing procedure (rule). Although probability 
(or at least nondeterminism) enters the structural learning theory at this 
level for the first time (e.g., with computations involving given rules), certain 
general constraints relating to the goal switching mechanism are assumed to govern 
the erasure of information from active store. 

In sum, the structural learning theory appears to have greater generality 
and parsimony. Critical parts of the theory have also withstood rather demanding 
empirical test. The situation regarding heuristic power in generating research 
is inconclusive at present, since both formulations appear pregnant in this 
sense. It is basically a case of competing paradigms (Kuhn, 1970). 

It should be noted, however, that very little work has been done to date 
in applying the structural learning theory to natural language. The reasons 
are several, not the least of which is my shared belief (cf. Greeno, 1972) that 
more progress can be made, at least initially, by attacking less ambiguous kinds 
of knowledge before moving ahead pell-mell into the man-made morass called 
"natural language." 

Relationships to Heterarchical Systems in Artificial Intelligence 

In this section we comment briefly on the relationship between the structural 
learning mechanism and the notion of heterarchical control in systems of 
artificial intelligence (Minsky & Papert, 1972). 

For a time artificial intelligence systems were viewed as wholes, as 
frequently complex programs. As work in the area progressed, the difficulties 
of building upon earlier work, and even of making changes in existing systems, 
became increasingly clear because of the close interrelationships among various 
parts of such systems. To overcome this limitation, heterarchical, or modular, 
planning has been used (e.g., Winograd, 1971; Winston, 1970; Charniak, 1972). 
Heterarchical systems consist of sets of programs (modules) pertaining to syntax, 
semantics, line detection, and so on, together with a heterarchical executive 
which switches control among these "modules" in accordance with a predetermined 
plan. At the present time, the MIT group is planning ways of enriching the 
heterarchical control systems they have developed to date to allow for more 
flexibility (Winograd, personal communication). 

It should be apparent that modules in heterarchical systems correspond 
essentially to rules in the structural learning theory; the executive control 
structure corresponds to the basic mechanism. There is, however, an important 
difference between the two. In heterarchical systems, the basic goal is prag- 
matic. Such systems make it easier to modify and to build upon previous work. 
No one seriously means to imply that heterarchical control reflects the way 
people perform, although in developing artificial intelligence systems intuitive 
judgments are sometimes made with this in mind. 

Although any rule system conforming to the structural learning mechanism 
can be simulated with (in fact is) a heterarchical system and vice versa, this 
is not the main point. The structural learning mechanism is assumed to be built 
into people (presumably from birth); it is not learned and need not be taught. 
While the rules a person knows may increase from time to time, the mechanism 
is assumed to remain constant. 

This is a strong claim, something which no responsible person would make 
concerning executive systems currently used in heterarchical systems. Among 
other things, it is very unlikely that an existing control system would be useful 
in systems other than the one for which it was designed. It is my contention 
that benefits might accrue in artificial intelligence and, of course, in 
simulation if control structures like that of the structural learning theory were used. 

Conclusions 

By way of summary, let us return to the questions with which we began. 
In a theory of memory, what parts should be fixed? What parts should be flexible? 

It would appear from the structural learning analysis that while certain 
portions of cognitive theory seem to be fixed, much more appears to be flexible. 
Furthermore, the question of what is fixed and what is flexible enters in a 
number of different ways. Competence theories, for example, are fixed, at least 
for given populations and particular content. Knowledge, however, is flexible. 
It varies over individuals, although there are specific methods for determining 
knowledge from individual behavior and a fixed competence theory. 

According to our analysis, the mechanism by which knowledge is selected, 
put to use, and acquired, also appears to be fixed. Finally, it would appear 
that each person has a fixed finite capacity for processing information, a 
capacity rooted not in the processes used but in the physiological character of man. 

FOOTNOTES 

1. State spaces (tree diagrams) are also widely used in analyzing 
the hierarchical structure of subject matters (e.g., Gagne, 1962). Such 
structures correspond to levels of refinement of rules in the structural 
learning theory (for details, see Scandura, 1973, Chapter 5). 

2. The model also allows for learning during recall test trials 
but this complication need not concern us here. 

3. The authors also define higher order nodes which, analogous to 
Gagne's (1965) use of the term, are concatenations of other nodes, not to 
be confused with higher order rules in the structural learning theory 
(Scandura, 1973). The latter operate on classes of rules and may, for 
example, generate composite (concatenations of) rules. 

4. Even though knowledge is always defined in terms of the rules in 
a predetermined competence theory, it must not be thought that such knowledge 
is arbitrary. If two or more rules of competence each provide a consistent 
basis for assessing behavior potential (i.e., if performance on the 
respective equivalence classes is homogeneous), then the respective (sub)rules 
used to characterize knowledge are necessarily equivalent. Furthermore, 
any viable competence theory in this view must be capable of withstanding 
behavioral test (Scandura, 1972). Competence and knowledge are analogous to the 
chicken and the egg insofar as priority is concerned. 

5. In actuality, this mechanism is oversimplified. For details 
concerning an enriched mechanism which deals with rule selection (where 
two or more rules apply), and which allows for false starts (i.e., 
backtracking), see Scandura (1973, Chapter 9). 

6. There was reason to believe in the one deviant case that the conditions 
of the experiment had not been adequately fulfilled. The subject was run through 
the same experiment a week later, using different rules, this time with positive 
results. 

7. In order not to mislead the reader, it should be mentioned that this 
"first" attempt came after a considerable amount of preliminary pilot work. 

8. The results of a seventh subject were difficult to interpret 
because the D test on Posttest I indicated that the D training had not been 
effective. After Posttest I the subject was retrained on the D rule and Posttest 
I was repeated. Then he was given the H training and Posttest II. In both cases, 
he succeeded on all three tasks. 

9. According to Scandura (1973), the main reason that this has not been 
done is because the theory seems to call for different kinds of data. The theory 
does not seem to provide any compelling insights into free recall, for example. 
Its major advantages seem to lie in the analysis of memory of more meaningful 
knowledge which can be readily and unambiguously represented in terms of rules. 

10. Scandura (1973) made no attempt to deal with the question of processing 
time, deferring here to ongoing research in the area (e.g., Sternberg, 1969). 
Scandura's position was that processing time may ultimately be traced to certain 
physiologically based behavior constants of individual subjects, in the same 
sense that the processing capacity of individual subjects is fixed. 
FIGURE CAPTIONS 

Figure 1. The directed graph represents a state space in which 
the nodes represent states and the arrows represent operators or relations. 
S denotes the starting state and the G denote goals. 

Figure 2. This figure shows a portion of the state space for 
DONALD + GERALD = ROBERT illustrating a simple heuristic search. Once 
a digit has been assigned to one letter (e.g., 4 to L), it cannot 
be assigned to other letters, thereby reducing the search. 

Figure 3. The directed graphs labelled 1, 2, 3, and 4 represent 
the four paths through the indicated procedure for generating the 
"next" numeral in Base Three Arithmetic. The sample S-R pairs belong 
to the four equivalence classes defined by the paths. 

Figure 4A and 4B. Schematic representations of memory in the 
idealized theory of structural learning (4A) and in the "enriched" theory 
of memory (4B). 

Figure 5. Samples of simple and composite rule cards. 